Preface

Open RStudio to do the practicals. Tasks marked with * are optional.

R packages

This practical uses the following R package (with the version that was used to generate the solutions):

  • survival (version: 3.2.11)

R version 4.1.0 (2021-05-18)

Dataset

For this practical, we will use the heart and retinopathy data sets from the survival package. More details about the data sets can be found in:

https://stat.ethz.ch/R-manual/R-devel/library/survival/html/heart.html

https://stat.ethz.ch/R-manual/R-devel/library/survival/html/retinopathy.html

Indexing and Subsetting

Often we want to work with only part of a data set, for example selected rows or columns, before computing descriptive statistics or performing a statistical analysis.

Indexing

Task 1

Using the heart data set:

  • Select the first row.
  • Select the first column.
  • Select the column surgery.

Solution 1

library(survival)  # provides the heart and retinopathy data sets
heart[1, ]
heart[, 1]
##   [1]   0.0   0.0   0.0   1.0   0.0  36.0   0.0   0.0   0.0  51.0   0.0   0.0   0.0  12.0   0.0
##  [16]  26.0   0.0   0.0  17.0   0.0  37.0   0.0   0.0  28.0   0.0   0.0  20.0   0.0   0.0  18.0
##  [31]   0.0   8.0   0.0  12.0   0.0   3.0   0.0  83.0   0.0  25.0   0.0   0.0   0.0  71.0   0.0
##  [46]   0.0  16.0   0.0   0.0  17.0   0.0  51.0   0.0  23.0   0.0   0.0  46.0   0.0  19.0   0.0
##  [61]   4.5   0.0   2.0   0.0  41.0   0.0  58.0   0.0   0.0   0.0   0.0   1.0   0.0   2.0   0.0
##  [76]  21.0   0.0   0.0  36.0   0.0  83.0   0.0  32.0   0.0   0.0  41.0   0.0   0.0  10.0   0.0
##  [91]  67.0   0.0   0.0  21.0   0.0  78.0   0.0   3.0   0.0   0.0   0.0  27.0   0.0  33.0   0.0
## [106]  12.0   0.0   0.0  57.0   0.0   3.0   0.0  10.0   0.0   5.0   0.0  31.0   0.0   4.0   0.0
## [121]  27.0   0.0   5.0   0.0   0.0  46.0   0.0   0.0 210.0   0.0  67.0   0.0  26.0   0.0   6.0
## [136]   0.0   0.0  32.0   0.0  37.0   0.0   0.0   8.0   0.0  60.0   0.0  31.0   0.0 139.0   0.0
## [151] 160.0   0.0   0.0 310.0   0.0  28.0   0.0   4.0   0.0   2.0   0.0  13.0   0.0  21.0   0.0
## [166]  96.0   0.0   0.0  38.0   0.0   0.0   0.0
heart["surgery"]    # returns a one-column data frame
heart[["surgery"]]  # returns the column as a vector
##   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [48] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1
##  [95] 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
## [142] 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0
heart[, "surgery"]
##   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [48] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 1 0 0 1 1 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1
##  [95] 1 1 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0
## [142] 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0
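
The three ways of selecting the column surgery do not all return the same kind of object. The following is a minimal sketch on a small made-up data frame (df and its values are hypothetical, not taken from the heart data) that shows the difference, and how drop = FALSE keeps the data frame structure:

```r
# Hypothetical toy data frame, used only for illustration
df <- data.frame(id = 1:3, surgery = c(0, 1, 0))

as_df  <- df["surgery"]                  # single brackets, one index: one-column data frame
as_vec <- df[["surgery"]]                # double brackets: the column as a plain vector
by_pos <- df[, "surgery"]                # row/column form: simplified to a vector by default
kept   <- df[, "surgery", drop = FALSE]  # drop = FALSE prevents the simplification

class(as_df)               # "data.frame"
class(as_vec)              # "numeric"
identical(as_vec, by_pos)  # TRUE
class(kept)                # "data.frame"
```

Which form to use depends on what comes next: functions such as mean() expect a vector, while code that goes on to select rows and columns usually expects a data frame.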

Task 2

Create a matrix that takes the values 1:4 and has 2 rows and 2 columns. You can name this object mat. Select the second row of all columns.

Solution 2

mat <- matrix(1:4, 2, 2)
mat[2, ]
## [1] 2 4
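
The result 2 4 follows from the way matrix() fills its values column by column. As a small sketch of this (the object names row2_vec, row2_mat and mat_byrow are only illustrative), the code below also shows the byrow argument and how to keep the selected row as a matrix:

```r
mat <- matrix(1:4, nrow = 2, ncol = 2)  # filled column by column: columns (1, 2) and (3, 4)

row2_vec <- mat[2, ]                    # simplified to a vector: 2 4
row2_mat <- mat[2, , drop = FALSE]      # stays a matrix with 1 row and 2 columns

# byrow = TRUE fills the matrix row by row instead
mat_byrow <- matrix(1:4, nrow = 2, ncol = 2, byrow = TRUE)
mat_byrow[2, ]                          # 3 4
```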

Task 3

Create an array that consists of 2 matrices. Matrix 1 will consist of the values 1:4 and matrix 2 will consist of the values 5:8. Both matrices will have 2 columns and 2 rows. Give the name ar1 to this array. Select the 2nd row of all columns from each matrix.

Solution 3

ar1 <- array(data = 1:8, dim = c(2, 2, 2))
ar1[2, , ]
##      [,1] [,2]
## [1,]    2    6
## [2,]    4    8
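
The layout of this result can be confusing at first: each column of ar1[2, , ] holds the second row of one of the two matrices. A short sketch that takes the array apart along its third dimension makes this visible (res is just an illustrative name):

```r
ar1 <- array(data = 1:8, dim = c(2, 2, 2))

ar1[, , 1]         # matrix 1 (values 1:4)
ar1[, , 2]         # matrix 2 (values 5:8)

ar1[2, , 1]        # second row of matrix 1: 2 4
ar1[2, , 2]        # second row of matrix 2: 6 8

res <- ar1[2, , ]  # both at once
res[, 1]           # 2 4, the second row of matrix 1
res[, 2]           # 6 8, the second row of matrix 2
```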

Subsetting

Task 1

Using the retinopathy data set:

  • Select the futime for all adult patients.
  • Select all the variables for the treated eyes (rows where trt is 1).

Solution 1

retinopathy$futime[retinopathy$type == "adult"]
##   [1] 46.23 46.23 58.07 13.83 46.43 48.53 44.40  7.90 39.57 39.57 30.83 38.57 66.27 14.10 58.43
##  [16] 41.40 57.43 57.43 61.40  0.60 60.27 26.37  5.77  1.33 25.63 21.90 46.90 22.00 25.80 13.87
##  [31]  5.73 48.30  9.90  9.90 46.73  2.67 18.73 13.83 32.03  4.27 69.87 13.90 56.57 56.57  8.30
##  [46]  8.30 21.57 18.43 31.63 31.63 39.77 39.77 52.33  5.83  4.10 12.20 38.07 12.73 54.10 54.10
##  [61] 50.47 50.47 38.83 38.83 26.23 40.03 38.07 38.07 65.23 65.23  7.07 66.77  9.63  9.63 33.63
##  [76] 33.63 63.33 27.60 38.47  1.63 55.23 55.23 52.77 25.30  9.87  1.70 38.77 19.40 13.83  1.57
##  [91] 46.50 13.37 42.47 22.20 38.73 38.73 51.13 51.13 55.33 55.33 12.93  4.97 54.20 26.47 24.43
## [106]  9.87 50.23 50.23 42.23 42.23 66.93 66.93 67.47 38.57  3.67  3.67 20.07  8.83 55.13 55.13
## [121] 42.20 42.20 38.27 38.27 63.63 26.17 54.37 54.37 54.60 10.97 63.87 21.10 62.37 43.70 62.80
## [136] 62.80 63.33 14.37 58.53 58.53 58.07 58.07 58.50 58.50  1.50 14.37 51.10 51.10 49.93  6.57
## [151] 46.27 46.27 10.60 10.60 42.77 42.77 74.97 61.83 62.00 62.00 51.60 42.33 49.97  2.90 41.93
## [166] 41.93
retinopathy[retinopathy$trt == 1, ]
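
Both solutions rely on a logical vector: the comparison returns TRUE or FALSE for every row, and only the TRUE rows are kept. The sketch below illustrates this on a small made-up data frame (toy and its values are hypothetical and only borrow the column names of retinopathy). It also shows the base R function subset() as one possible alternative:

```r
# Hypothetical toy data (not the real retinopathy values)
toy <- data.frame(
  type   = c("adult", "juvenile", "adult", "juvenile"),
  trt    = c(1, 0, 0, 1),
  futime = c(46.2, 13.8, 58.1, 7.9)
)

is_adult <- toy$type == "adult"   # TRUE FALSE TRUE FALSE
adult_futime <- toy$futime[is_adult]
adult_futime                      # 46.2 58.1

# subset() evaluates the condition inside the data frame,
# so the columns can be named without the toy$ prefix
treated <- subset(toy, trt == 1)  # same rows as toy[toy$trt == 1, ]
adult_fu_df <- subset(toy, type == "adult", select = futime)
```

subset() is convenient for interactive work; the bracket form used in the solution is the more explicit choice inside functions and scripts.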

Task 2

Using the retinopathy data set:

  • Select the age for patients that have futime greater than 20.
  • Select the age for patients that have futime greater than 20 and are adults.
  • Select patients that have no missing values in age.

Solution 2

retinopathy$age[retinopathy$futime > 20]
##   [1] 28 28 12 12  9  9  9  9 13 12 12  8 12 12 21 23 23 44 47 47 48 48 26 10 23 23  5  5 46 46  5
##  [32]  5 13 13 45  1  1 12 12 36 36 10 25 25 14 38 38 14 14 10 10 17 17 44 21 19 19 13  9  9 48 24
##  [63] 55 17 17  5  5 19 12 12 45 45 43  4  4 45 45 32 32 13 13 15 10 10 17 17 37 18 13 14 14 12 12
##  [94]  9  9 10 10  5  5  7  2  5  5  4  4 27 53 53 10 13 12 12 24 24 17 17  8  8 58 58 17 17 12 12
## [125] 25 25 15 21 21 20 20 23  5  5  8 30 30  7  7 39 39 26 50 50 34 34 10 10 13 13 11 11  9  5  5
## [156] 10 10 23  2  2 12 12  7  7 13 20 30 30 32 32 39 39  4 10  6  6 33 33 15 15 48 48  4  4 46 25
## [187] 25 12 12 12 26 26 11 11 36 36 12 12 50 50  8  8 14 14 56  9  9  5  5  1  1 57 57 33 33 46 46
## [218] 35 35  8  8 30 30 51 42 42 20 20 23 23 22 25 25 45 45 20 20 19 19  4 36 36 20 24 24 51 51 16
## [249] 16 16 16 10 10 20 20 10 16 16 10 10 11 11 17 17  7  7 29 29  5 22 22 33  3 32 32
retinopathy$age[retinopathy$futime > 20 & retinopathy$type == "adult"]
##   [1] 28 28 21 23 23 44 47 47 48 48 26 23 23 46 46 45 36 36 25 25 38 38 44 21 48 24 55 45 45 43 45
##  [32] 45 32 32 37 27 53 53 24 24 58 58 25 25 21 21 20 20 23 30 30 39 39 26 50 50 34 34 23 20 30 30
##  [63] 32 32 39 39 33 33 48 48 46 25 25 26 26 36 36 50 50 56 57 57 33 33 46 46 35 35 30 30 51 42 42
##  [94] 20 20 23 23 22 25 25 45 45 20 20 36 36 20 24 24 51 51 20 20 29 29 22 22 33 32 32
retinopathy[!is.na(retinopathy$age), ]
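
Missing values also affect the comparisons themselves: a comparison with NA returns NA, and indexing with NA produces an NA element in the result. Whether retinopathy actually contains missing values is not stated here, so the following sketch uses a made-up data frame (toy and its values are hypothetical) to show the behaviour and two common ways of handling it:

```r
# Hypothetical toy data with one missing age and one missing futime
toy <- data.frame(
  age    = c(28, 33, NA, 45),
  futime = c(46.2, NA, 13.8, 30.0)
)

toy$futime > 20                                # TRUE NA FALSE TRUE
with_na    <- toy$age[toy$futime > 20]         # 28 NA 45: the NA condition yields NA
without_na <- toy$age[which(toy$futime > 20)]  # 28 45: which() keeps only TRUE positions

age_known <- toy[!is.na(toy$age), ]            # rows with a non-missing age (rows 1, 2, 4)
complete  <- toy[complete.cases(toy), ]        # rows without any missing value (rows 1, 4)
```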

Task 3*

Using the retinopathy data set:

  • Select only the rows of the left eye.
  • Select only the rows of adult patients.

Solution 3*

retinopathy[retinopathy$eye == "left", ]
retinopathy[retinopathy$type == "adult", ]
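
One detail to keep in mind after subsetting on a factor: the levels that no longer occur are still stored in the result. The sketch below assumes that eye is stored as a factor, as the survival documentation describes, and uses a made-up data frame (toy and its values are hypothetical) to show the effect and how droplevels() removes the unused levels:

```r
# Hypothetical toy data with eye stored as a factor
toy <- data.frame(
  eye    = factor(c("left", "right", "left")),
  futime = c(46.2, 13.8, 58.1)
)

left <- toy[toy$eye == "left", ]
levels(left$eye)         # "left" "right": the unused level is still there

left_clean <- droplevels(left)
levels(left_clean$eye)   # "left"
```

This matters mainly for tables and plots, where unused levels would otherwise appear as empty categories.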
 

© Eleni-Rosalina Andrinopoulou